  PaRSEC is a generic framework for architecture-aware scheduling and
management of micro-tasks on distributed many-core heterogeneous
architectures. Applications are expressed as a Directed Acyclic Graph
(DAG) of tasks whose labeled edges designate data dependencies. PaRSEC
assigns computation threads to the cores and overlaps communication
with computation, both between nodes and between the host and
accelerators (such as GPUs). It achieves this with a dynamic,
fully distributed scheduler that takes into account architectural
features, such as NUMA nodes and GPU awareness, as well as algorithmic
features, such as data reuse.
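
To illustrate what "a DAG of tasks with labeled edges" means, here is a
minimal single-process sketch in Python. It is not PaRSEC's API: the
function run_dag, the task names, and the edge labels are all
hypothetical, and the sketch does no distribution, threading, or
accelerator offload. It only shows a task becoming ready once every
labeled input it depends on has been produced.

```python
from collections import deque


def run_dag(tasks, edges):
    """Run tasks in dependency order (illustrative only, not PaRSEC).

    tasks: dict mapping task name -> callable taking a dict of inputs
    edges: list of (producer, consumer, label) data dependencies
    """
    pending = {name: 0 for name in tasks}        # unmet inputs per task
    successors = {name: [] for name in tasks}    # outgoing labeled edges
    for src, dst, label in edges:
        pending[dst] += 1
        successors[src].append((dst, label))

    inputs = {name: {} for name in tasks}
    results = {}
    order = []
    # Tasks with no incoming edges are ready immediately.
    ready = deque(name for name, count in pending.items() if count == 0)

    while ready:
        name = ready.popleft()
        results[name] = tasks[name](inputs[name])
        order.append(name)
        # Forward the result along each labeled edge and release
        # consumers whose inputs are now all available.
        for dst, label in successors[name]:
            inputs[dst][label] = results[name]
            pending[dst] -= 1
            if pending[dst] == 0:
                ready.append(dst)

    if len(order) != len(tasks):
        raise ValueError("dependency graph contains a cycle")
    return order, results


# Hypothetical four-task graph: two producers feed one consumer,
# whose output feeds a final task.
tasks = {
    "load_a": lambda data: 2,
    "load_b": lambda data: 3,
    "multiply": lambda data: data["A"] * data["B"],
    "square": lambda data: data["C"] ** 2,
}
edges = [
    ("load_a", "multiply", "A"),
    ("load_b", "multiply", "B"),
    ("multiply", "square", "C"),
]
order, results = run_dag(tasks, edges)
```

In this sketch "multiply" runs only after both "load_a" and "load_b"
have delivered their data, and "square" runs last. A runtime such as
PaRSEC additionally decides where and when each ready task executes.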

Requires an MPI implementation, either MPICH2 or Open MPI.

Optional requirements:
 - hwloc (autodetect)
 - PAPI (autodetect)
 - NVIDIA CUDA (autodetect)
